Generating Image Descriptions using Multilingual Data

نویسنده

  • Alan Jaffe
چکیده

In this paper we explore several neural network architectures for the WMT 2017 multimodal translation sub-task on multilingual image caption generation. The goal of the task is to generate image captions in German, using a training corpus of images with captions in both English and German. We explore several models which attempt to generate captions for both languages, ignoring the English output during evaluation. We compare the results to a baseline implementation which uses only the German captions for training and show significant improvement.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multi30K: Multilingual English-German Image Descriptions

We introduce the Multi30K dataset to stimulate multilingual multimodal research. Recent advances in image description have been demonstrated on Englishlanguage datasets almost exclusively, but image description should not be limited to English. This dataset extends the Flickr30K dataset with i) German translations created by professional translators over a subset of the English descriptions, an...

متن کامل

Toward Language Independent Methodology for Generating Artwork Descriptions - Exploring FrameNet Information

Today museums and other cultural heritage institutions are increasingly storing object descriptions using semantic web domain ontologies. To make this content accessible in a multilingual world, it will need to be conveyed in many languages, a language generation task which is domain specific and language dependent. This paper describes how semantic and syntactic information such as that provid...

متن کامل

Image Pivoting for Learning Multilingual Multimodal Representations

In this paper we propose a model to learn multimodal multilingual representations for matching images and sentences in different languages, with the aim of advancing multilingual versions of image search and image understanding. Our model learns a common representation for images and their descriptions in two different languages (which need not be parallel) by considering the image as a pivot b...

متن کامل

On generating coherent multilingual descriptions of museum objects from Semantic Web ontologies

During the last decade, there has been a shift from developing natural language generation systems to developing generic systems that are capable of producing natural language descriptions directly from Web ontologies. To make these descriptions coherent and accessible in different languages, a methodology is needed for identifying the general principles that would determine the distribution of...

متن کامل

Automatically generating multilingual, semantically enhanced, descriptions of digital audio and video objects on the Web

Every day, millions of new images, videos and audios are uploaded to the web. However, unlike text-based content, audio and video objects cannot be indexed by search engines. Thus, much valuable multimedia content stay unreachable for a great majority of online users. To overcome this problem we introduce a technique that automatically generates semantically enhanced descriptions of audio and v...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017